Improved tools for biological sequence comparison.

Authors

  • W R Pearson
  • D J Lipman
Abstract

We have developed three computer programs for comparisons of protein and DNA sequences. They can be used to search sequence data bases, evaluate similarity scores, and identify periodic structures based on local sequence similarity. The FASTA program is a more sensitive derivative of the FASTP program, which can be used to search protein or DNA sequence data bases and can compare a protein sequence to a DNA sequence data base by translating the DNA data base as it is searched. FASTA includes an additional step in the calculation of the initial pairwise similarity score that allows multiple regions of similarity to be joined to increase the score of related sequences. The RDF2 program can be used to evaluate the significance of similarity scores using a shuffling method that preserves local sequence composition. The LFASTA program can display all the regions of local similarity between two sequences with scores greater than a threshold, using the same scoring parameters and a similar alignment algorithm; these local similarities can be displayed as a "graphic matrix" plot or as individual alignments. In addition, these programs have been generalized to allow comparison of DNA or protein sequences based on a variety of alternative scoring matrices.
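The abstract states that significance is evaluated with a shuffling method that preserves local sequence composition, but it gives no details. As a minimal sketch of one way this could work, the Python below assumes the sequence is shuffled within consecutive fixed-size windows and that significance is expressed as a z-score against scores from shuffled sequences. The names `local_shuffle` and `z_score`, the window size of 10, and the z-score formulation are illustrative assumptions, not the RDF2 implementation.

```python
import random


def local_shuffle(seq, window=10, rng=None):
    """Shuffle residues inside consecutive windows (assumed scheme).

    Each window keeps exactly the residues it started with, so local
    composition is preserved while residue order is randomized.
    """
    rng = rng or random.Random()
    out = []
    for start in range(0, len(seq), window):
        block = list(seq[start:start + window])
        rng.shuffle(block)
        out.extend(block)
    return "".join(out)


def z_score(observed, shuffled_scores):
    """Standard deviations by which `observed` exceeds the shuffled mean."""
    n = len(shuffled_scores)
    if n < 2:
        raise ValueError("need at least two shuffled scores")
    mean = sum(shuffled_scores) / n
    var = sum((s - mean) ** 2 for s in shuffled_scores) / (n - 1)
    if var == 0:
        raise ValueError("shuffled scores have zero variance")
    return (observed - mean) / var ** 0.5
```

In use, one would score the query against many locally shuffled copies of the library sequence with whatever similarity score is being evaluated, then pass the unshuffled score and the list of shuffled scores to `z_score`.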

Similar resources

Minimizing the total tardiness and makespan in an open shop scheduling problem with sequence-dependent setup times

We consider an open shop scheduling problem with setup and processing times separately such that not only the setup times are dependent on the machines, but also they are dependent on the sequence of jobs that should be processed on a machine. A novel bi-objective mathematical programming is designed in order to minimize the total tardiness and the makespan. Among several mult...

Mathematical Models, Algorithms, and Statistics of Sequence Alignment

The problem of biological sequence comparison arises naturally in an attempt to explain many biological phenomena. Due to the combinatorial structure and pattern preserving properties of the sequences it has attracted not only biologists, but also mathematicians, statisticians and computer scientists. In this work we study one of the most effective tools widely used for comparison of biological...

A Parallel Non-Alignment Based Approach to Efficient Sequence Comparison using Longest Common Subsequences

Biological sequence comparison programs have revolutionized the practice of biochemistry, molecular and evolutionary biology. Pairwise comparison is the method of choice for many computational tools developed to analyze the deluge of genetic sequence data. Unfortunately, a comprehensive study of the strengths and weaknesses of different alignment algorithms applied to different biological probl...

iProsite: an improved prosite database achieved by replacing ambiguous positions with more informative representations

PROSITE database contains a set of entries corresponding to protein families, which are used to identify the family of a protein from its sequence. Although patterns and profiles are developed to be very selective, each may have false positive or negative hits. Considering false positives as items that reduce the selectiveness of a pattern, then, the more selective pattern we have, a more accur...

Molecular Docking and In Silico Study of Denileukin Diftitox: Comparison of Wild Type With C519S-Mutant

Background: Denileukin diftitox (trade name, Ontak) is the first recombinant immunotoxin (IM), in which the binding domain of diphtheria toxin has been replaced by the amino acid sequence of human interleukin-2 (DT389IL-2) using genetic engineering. Purity, stability, and structural property of the protein are critical factors for the scale-up production of this fusion protein. In this IM, loca...

A bioinformatics approach to 2D shape classification

In the past, the huge and profitable interaction between Pattern Recognition and biology/bioinformatics was mainly unidirectional, namely targeted at applying PR tools and ideas to analyse biological data. In this paper we investigate an alternative approach, which exploits bioinformatics solutions to solve PR problems: in particular, we address the 2D shape classification problem using classic...

Journal:
  • Proceedings of the National Academy of Sciences of the United States of America

Volume 85, Issue 8

Pages: -

Publication date: 1988